Abstract

Ray Solomonoff invented the notion of universal induction featuring an 
aptly termed "universal" prior probability function over all possible com- 
putable environments [Sol64]. The essential property of this prior was its 
ability to dominate all other such priors. Later, Levin introduced another 
construction — a mixture of all possible priors or "universal mixture" [ZL70]. 
These priors are well known to be equivalent up to multiplicative constants. 
Here, we seek to clarify further the relationships between these three characterisations
of a universal prior (Solomonoff's, universal mixtures, and universally dominant
priors). We see that the constructions of Solomonoff
and Levin define an identical class of priors, while the class of universally 
dominant priors is strictly larger. We provide some characterisation of the 
discrepancy. 
1 Introduction



In the study of universal induction, we consider an abstraction of the world in 
the form of a binary string. Any sequence from a finite set of possibilities can 
be expressed in this way, and that is precisely what contemporary computers are 
capable of analysing. An "environment" provides a measure of probability to (possibly
infinite) binary strings. Typically, the class $\mathcal{M}$ of enumerable semimeasures
is considered. Given the equivalence between $\mathcal{M}$ and the set of monotone Turing
machines (Lemma 6), this choice reflects the expectation that the environment can
be computed by (or at least approximated by) a Turing machine.

Universal induction is an ideal Bayesian induction mechanism assigning prob- 
abilities to possible continuations of a binary string. In order to do this, a prior 
distribution, termed a universal prior, is defined on binary strings. This prior has 
the property that the Bayesian mechanism converges to the true (generating) environment
for any environment $\mu$ in $\mathcal{M}$, given sufficient evidence.

There are three popular ways of defining a universal prior in the literature: 
Solomonoff's prior [Sol64, ZL70, Hut05], as a universal mixture [ZL70, Hut05, 
Hut07], or a universally dominant semimeasure [Hut05, Hut07]. Briefly, a universally
dominant semimeasure is one that dominates every other semimeasure in $\mathcal{M}$
(Definition 9), a universal mixture is a mixture of all semimeasures in $\mathcal{M}$ with
non-zero coefficients (Definition 8), and a Solomonoff prior assigns the probability 
that a (chosen) monotone universal Turing machine outputs a string given random 
input (Definition 7). These and other relevant concepts are defined in more detail 
in Section 2. 

Solomonoff's and the universal mixture constructions have been known for many 
years and they are often used interchangeably in textbooks and lecture notes. Their 
equivalence has been shown in the sense that they dominate each other [ZL70, Hut05, 
LV08]. We extend this result in Section 3, showing that they in fact define exactly 
the same class of priors. 

Further, it is trivial to see that both constructions produce universally dominant 
semimeasures. The converse is, however, not true. Universally dominant semimea- 
sures are a larger class. We provide a simple example to demonstrate this in Section 
4. 

These results are relatively undemanding technically. However, given their fundamental
nature, the fact that to our knowledge they have not been published to date, and
their relevance to Ray Solomonoff's famous work on universal induction, we present
them here.

The following diagram summarises these inclusion relations:

[Figure 1: Universally Dominant (top); Universal Mixture = Solomonoff Prior, by Theorem 14 (bottom).]

2 Definitions

We represent the set of finite/infinite binary strings as $\mathbb{B}^*$ and $\mathbb{B}^\infty$
respectively, $\epsilon$ denotes the empty string, $xb$ the concatenation of strings $x$ and $b$,
and $\ell(x)$ the length of a string $x$. A cylinder set, the set of all infinite binary strings
which start with some $x \in \mathbb{B}^*$, is denoted $\Gamma_x$.

A string $x$ is said to be a prefix of a string $y$ if $y = xz$ for some string $z$. We
write $x \sqsubseteq y$, or $x \sqsubset y$ if $x$ is a proper prefix of $y$ (i.e. $z \neq \epsilon$). We denote the
maximal prefix-free subset of a set of finite strings $\mathcal{P}$ by $\lfloor\mathcal{P}\rfloor$. It can be obtained by
successively removing elements that have a proper prefix in $\mathcal{P}$. The uniform measure of a
set of strings is denoted $|\mathcal{P}| := \sum_{p \in \lfloor\mathcal{P}\rfloor} 2^{-\ell(p)}$. This is the area of continuations of
elements of $\mathcal{P}$ considered as binary decimal numbers.
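The two operations just defined can be sketched in a few lines of Python. This is only an illustration under stated assumptions: binary strings are represented as Python `str` values over the characters `0` and `1`, and the function names `maximal_prefix_free` and `uniform_measure` are hypothetical, chosen here for readability.

```python
from fractions import Fraction


def maximal_prefix_free(strings):
    """Keep only the strings that have no proper prefix in the set."""
    pool = set(strings)
    # s[:k] for k < len(s) ranges over all proper prefixes of s (including "").
    return {s for s in pool if not any(s[:k] in pool for k in range(len(s)))}


def uniform_measure(strings):
    """Sum 2^-len(p) over the maximal prefix-free subset of the given set."""
    kept = maximal_prefix_free(strings)
    return sum((Fraction(1, 2 ** len(p)) for p in kept), Fraction(0))


example = {"0", "01", "10", "100", "11"}
print(sorted(maximal_prefix_free(example)))
print(uniform_measure(example))
```

Exact rationals (`Fraction`) are used so that sums of powers of two can be compared without rounding.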

There have been several definitions of monotone Turing machines in the literature
[LV08]; we choose the one that is now widely accepted [Sol64, ZL70, Hut05,
LV08] and that has the useful and intuitive property stated in Lemma 6.

Definition 1. A monotone Turing machine is a computer with binary (one-way) 
input and output tapes, a bidirectional binary work tape (with read/write heads 
as appropriate) and a finite state machine to determine its actions given input and 
work tape values. The input tape is read-only, the output tape is write-only. 

The definitions of a universal Turing machine in the literature are somewhat 
varied or unclear. Monotone universal Turing machines are relevant here for defining 
the Solomonoff prior. In the algorithmic information theory literature, most authors 
are concerned with the explicit construction of a single reference universal machine 
[Hut05, LV08, Sol64, Tur36, ZL70]. A more general definition is left to a relatively 
vague statement along the lines of "a Turing machine that can emulate any other 
Turing machine". The definition below reflects the typical construction used and is
often referred to as universal by adjunction [DH10, FSW06].

Definition 2 (Monotone Universal Turing Machine). A monotone universal Turing 
machine is a monotone Turing machine U for which there exist: 

1. an enumeration $\{T_i : i \in \mathbb{N}\}$ of all monotone Turing machines

2. a computable uniquely decodable self-delimiting code $I : \mathbb{N} \to \mathbb{B}^*$

such that the programs for $U$ that produce output coincide with the set
$\{I(i)p : i \in \mathbb{N}, p \in \mathbb{B}^*\}$ of concatenations of $I(i)$ and $p$, and

$$U(I(i)p) = T_i(p) \quad \forall i \in \mathbb{N},\ p \in \mathbb{B}^*$$
A key concept in algorithmic information theory is the assignment of probability
to a string $x$ as the probability that some monotone Turing machine produces
output beginning with x given unbiased coin flip input. This approach was used by 
Solomonoff to construct a universal prior [Sol64]. To better understand the proper- 
ties of such a function, we will need the concepts of enumerability and semimeasures: 

Definition 3. A function or number $\phi$ is said to be enumerable or lower semicomputable
(these terms are synonymous) if it can be approximated from below
(pointwise) by a monotone increasing set $\{\phi_i : i \in \mathbb{N}\}$ of finitely computable
functions/numbers, all calculable by a single Turing machine. We write $\phi_i \nearrow \phi$. Finitely
computable functions/numbers can be computed in finite time by a Turing machine.

Definition 4. A semimeasure $\mu$ is a "defective" probability measure on the
$\sigma$-algebra generated by cylinder sets in $\mathbb{B}^\infty$. We write $\mu(x)$ for $x \in \mathbb{B}^*$ as shorthand
for $\mu(\Gamma_x)$. A probability measure must satisfy $\mu(\epsilon) = 1$, $\mu(x) = \sum_{b \in \mathbb{B}} \mu(xb)$. A
semimeasure allows a probability "gap": $\mu(\epsilon) \le 1$ and $\mu(x) \ge \sum_{b \in \mathbb{B}} \mu(xb)$. $\mathcal{M}$
denotes the set of all enumerable semimeasures.

The following definition explicates the relationship between monotone Turing 
machines and enumerable semimeasures. 

Definition 5 (Solomonoff semimeasure). For each monotone Turing machine T we 
associate a semimeasure 

$$\lambda_T(x) := \sum_{p \in \lfloor\{p \,:\, T(p) = x*\}\rfloor} 2^{-\ell(p)} = |T^{-1}(x*)|$$

where $\lfloor\mathcal{P}\rfloor$ indicates the maximal prefix-free subset of a set of finite strings $\mathcal{P}$,
$T(p) = x*$ indicates that $x$ is a prefix of (or equal to) $T(p)$ and $\ell(p)$ is the length of
$p$. If there are no such programs, we set $\lambda_T(x) := 0$. [See [LV08] Definition 4.5.4]

Note that this is the probability that T outputs a string starting with x given 
unbiased coin flip input. To see this, consider the uniform measure given by
$\lambda(\Gamma_p) := 2^{-\ell(p)}$. This is the probability of obtaining $p$ from unbiased coin flips. $\lambda_T(x)$ is the
uniform measure of the set of programs for $T$ that produce output starting with
$x$, i.e. the probability of obtaining one of those programs from unbiased coin flips.
Note also that, since $T$ is monotone, this set consists of a union of disjoint cylinder
sets $\{\Gamma_p : p \in \lfloor\{q : T(q) = x*\}\rfloor\}$. By dovetailing a search for such programs and a
lower approximation of the uniform measure $\lambda$, we can see that $\lambda_T$ is enumerable.
See Definition 4.5.4 (p.299) and Lemma 4.5.5 (p.300) in [LV08]. 
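As a rough illustration of how $\lambda_T$ is approximated from below, the following Python sketch enumerates all programs up to a length bound and sums $2^{-\ell(p)}$ over the minimal ones whose output starts with $x$. The machine `double_bits` is a hypothetical toy (it writes each input bit twice), modelled as a function from a finite input string to the output produced after reading it; it is not a real Turing machine simulation, and `lambda_lower` is a name chosen here.

```python
from fractions import Fraction
from itertools import product


def double_bits(p):
    """Toy monotone machine (hypothetical): writes every input bit twice."""
    return "".join(b + b for b in p)


def lambda_lower(machine, x, max_len):
    """Lower bound on lambda_T(x) using only programs of length <= max_len."""
    total = Fraction(0)
    minimal = []  # minimal programs found so far, shortest first
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            p = "".join(bits)
            if any(p.startswith(m) for m in minimal):
                continue  # cylinder of p is already counted via a shorter program
            if machine(p).startswith(x):
                minimal.append(p)
                total += Fraction(1, 2 ** n)
    return total


print(lambda_lower(double_bits, "0", 4))    # programs starting with "0"
print(lambda_lower(double_bits, "000", 4))  # programs starting with "00"
```

Because the machine is monotone, any extension of a program that already outputs $x$ also outputs $x$, so only minimal programs are counted. Raising `max_len` can only increase the returned value, which is the sense in which the approximation is "from below".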

An important lemma in this discussion establishes the equivalence between the 
set of all monotone Turing machines and the set $\mathcal{M}$ of all enumerable semimeasures.
It is equivalent to Theorem 4.5.2 in [LV08] (page 301) with a small correction:
$\lambda_T(\epsilon) = 1$ for any $T$ by construction, but $\mu(\epsilon)$ may not be 1, so this case must be
excluded.
Lemma 6. A semimeasure $\mu$ is lower semicomputable if and only if there is a
monotone Turing machine $T$ such that $\mu = \lambda_T$ except on $\Gamma_\epsilon = \mathbb{B}^\infty$ and $\mu(\epsilon)$ is lower
semicomputable.

We are now equipped to formally define the 3 formulations for a universal prior: 

Definition 7 (Solomonoff prior). The Solomonoff prior for a given universal monotone
Turing machine $U$ is

$$M := \lambda_U$$

The class of all Solomonoff priors we denote $\mathcal{U}_M$.

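Definition 8 (Universal mixture). A universal mixture is a mixture $\xi$ with non-zero
positive weights over an enumeration $\{\nu_i : i \in \mathbb{N}, \nu_i \in \mathcal{M}\}$ of all enumerable
semimeasures $\mathcal{M}$:

$$\xi = \sum_{i \in \mathbb{N}} w_i \nu_i, \qquad \sum_{i \in \mathbb{N}} w_i \le 1, \qquad w_i > 0$$

We require the weights $w_{(\cdot)}$ to be a lower semicomputable function. The mixture $\xi$
is then itself an enumerable semimeasure, i.e. $\xi \in \mathcal{M}$. The class of all universal
mixtures we denote $\mathcal{U}_\xi$.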

Definition 9 (Universally dominant semimeasure). A universally dominant
semimeasure is an enumerable semimeasure $\delta$ for which there exists a real number
$c_\mu > 0$ for each enumerable semimeasure $\mu$ satisfying:

$$\delta(x) \ge c_\mu\,\mu(x) \quad \forall x \in \mathbb{B}^*$$

The class of all universally dominant semimeasures we denote $\mathcal{U}_\delta$.

Dominance implies absolute continuity: Every enumerable semimeasure is absolutely
continuous with respect to a universally dominant enumerable semimeasure.
The converse (absolute continuity implies dominance) is however not true.

3 Equivalence between Solomonoff priors and
universal mixtures

We show here that every Solomonoff prior $M \in \mathcal{U}_M$ can be expressed as a universal
mixture (i.e. $M \in \mathcal{U}_\xi$) and vice versa. In other words the class of Solomonoff priors
and the class of universal mixtures are identical: $\mathcal{U}_M = \mathcal{U}_\xi$.

Previously, it was known [ZL70, Hut05, LV08] that a Solomonoff prior $M$ and a
universal mixture $\xi$ are equivalent up to multiplicative constants: there exist
constants $c, c' > 0$ such that

$$M(x) \ge c\,\xi(x) \quad \forall x \in \mathbb{B}^*$$
$$\xi(x) \ge c'\,M(x) \quad \forall x \in \mathbb{B}^*$$
The result we present is stronger, stating that the two classes are exactly identical.
Again we exclude the empty string $\epsilon$, as $M(\epsilon)$ is always one for a Solomonoff prior, but
$\xi(\epsilon)$ is never one for a universal mixture $\xi$ (as there are $\mu \in \mathcal{M}$ with $\mu(\epsilon) < 1$).

Lemma 10. For any monotone universal Turing machine U the associated 
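Solomonoff prior $M$ can be expressed as a universal mixture, i.e. there exists an
enumeration $\{\nu_i\}_{i \in \mathbb{N}}$ of the set of enumerable semimeasures $\mathcal{M}$ and computable function
$w_{(\cdot)} : \mathbb{N} \to \mathbb{R}$ such that

$$M(x) = \sum_{i \in \mathbb{N}} w_i\,\nu_i(x) \quad \forall x \in \mathbb{B}^* \setminus \epsilon$$

with $\sum_{i \in \mathbb{N}} w_i \le 1$ and $w_i > 0\ \forall i \in \mathbb{N}$. In other words the class of Solomonoff priors
is a subset of the class of universal mixtures: $\mathcal{U}_M \subseteq \mathcal{U}_\xi$.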

Proof. We note that all programs that produce output from U are uniquely of the 
form q = I{i)p. This allows us to split the sum in (1) below. 

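$$M(x) = \sum_{q \in \lfloor\{q \,:\, U(q) = x*\}\rfloor} 2^{-\ell(q)}$$

$$= \sum_{i \in \mathbb{N}} \; \sum_{p \in \lfloor\{p \,:\, U(I(i)p) = x*\}\rfloor} 2^{-\ell(I(i)p)} \qquad (1)$$

$$= \sum_{i \in \mathbb{N}} 2^{-\ell(I(i))} \sum_{p \in \lfloor\{p \,:\, T_i(p) = x*\}\rfloor} 2^{-\ell(p)}$$

$$= \sum_{i \in \mathbb{N}} 2^{-\ell(I(i))}\,\lambda_{T_i}(x)$$

Clearly $2^{-\ell(I(i))} > 0$ and is a computable function of $i$. Since $I$ is a self-delimiting
code it must be prefix free, and so satisfy Kraft's inequality:

$$\sum_{i \in \mathbb{N}} 2^{-\ell(I(i))} \le 1$$

Lemma 6 tells us that the $\lambda_{T_i}$ cover every enumerable semimeasure if $\epsilon$ is excluded
from their domain, which shows that $\sum_{i \in \mathbb{N}} 2^{-\ell(I(i))}\,\lambda_{T_i}(x)$ is a universal mixture. This
completes the proof. □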
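The splitting of the sum in (1) can be checked numerically on a small finite family of machines. The sketch below is an illustration only: the three toy machines, the three code words and the names `adjoined`, `mixture` and `lambda_lower` are assumptions made here, and a finite family is of course not universal. It compares $\lambda_U(x)$, computed directly from the programs of the adjoined machine, with the weighted sum $\sum_i 2^{-\ell(I(i))} \lambda_{T_i}(x)$.

```python
from fractions import Fraction
from itertools import product


# Hypothetical toy machines; each is monotone (longer input only extends output).
def copy_bits(p):
    return p


def zeros(p):
    return "0" * len(p)


def double_bits(p):
    return "".join(b + b for b in p)


MACHINES = [copy_bits, zeros, double_bits]
CODE = ["0", "10", "11"]  # prefix-free code words standing in for I(0), I(1), I(2)
WEIGHTS = [Fraction(1, 2 ** len(word)) for word in CODE]  # 2^-len(I(i))


def adjoined(q):
    """U(I(i)p) = T_i(p); inputs without a complete code word give no output."""
    for word, machine in zip(CODE, MACHINES):
        if q.startswith(word):
            return machine(q[len(word):])
    return ""


def lambda_lower(machine, x, max_len):
    """Lower bound on lambda_T(x) using only programs of length <= max_len."""
    total = Fraction(0)
    minimal = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            p = "".join(bits)
            if any(p.startswith(m) for m in minimal):
                continue  # already covered by a shorter program
            if machine(p).startswith(x):
                minimal.append(p)
                total += Fraction(1, 2 ** n)
    return total


def mixture(x, max_len):
    """Weighted sum of the individual lambda_{T_i}(x)."""
    return sum((w * lambda_lower(m, x, max_len) for w, m in zip(WEIGHTS, MACHINES)), Fraction(0))


for x in ["0", "1", "00", "10", "000"]:
    print(x, lambda_lower(adjoined, x, 6), mixture(x, 6))
```

For these toy machines every minimal program for an output of length at most 3 has length at most 5, so the bound `max_len = 6` makes both sides exact and the two printed columns agree. The weights sum to 1 here because the chosen code is complete; Kraft's inequality only guarantees a sum of at most 1.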

Corollary 11. [ZL70] The Solomonoff prior M for a universal monotone Turing 
machine U is universally dominant. Thus, the class of Solomonoff priors is a subset 
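of the class of universally dominant lower semicomputable semimeasures: $\mathcal{U}_M \subseteq \mathcal{U}_\delta$.

Proof. From Lemma 10 we have for each $\nu \in \mathcal{M}$ there exists $j \in \mathbb{N}$ with $\nu = \lambda_{T_j}$
and for all $x \in \mathbb{B}^*$:

$$M(x) = \sum_{i \in \mathbb{N}} 2^{-\ell(I(i))}\,\lambda_{T_i}(x) \ge 2^{-\ell(I(j))}\,\nu(x)$$

as required. □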
Lemma 12. Every universal mixture $\xi$ is universally dominant. Thus, the class of
universal mixtures is a subset of the class of universally dominant lower semicomputable
semimeasures: $\mathcal{U}_\xi \subseteq \mathcal{U}_\delta$.

Proof. This follows from a similar argument to that in Corollary 11. □

Lemma 13. For every universal mixture $\xi$ there exists a universal monotone Turing
machine and associated Solomonoff prior $M$ such that

$$\xi(x) = M(x) \quad \forall x \in \mathbb{B}^* \setminus \epsilon$$

In other words the class of universal mixtures is a subset of the class of Solomonoff
priors: $\mathcal{U}_\xi \subseteq \mathcal{U}_M$.

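Proof. First note that by Lemma 6 we can find (by dovetailing possible repetitions of
some indices) parallel enumerations $\{\nu_i\}_{i \in \mathbb{N}}$ of $\mathcal{M}$ and $\{T_i\}_{i \in \mathbb{N}}$ of all monotone
Turing machines with $\nu_i = \lambda_{T_i}$, and computable weight function $w_{(\cdot)}$ with

$$\xi(x) = \sum_{i \in \mathbb{N}} w_i\,\nu_i(x) \quad \forall x \in \mathbb{B}^* \setminus \epsilon$$

Take a lower approximation $\phi(i,t) \nearrow w_i$, computable in the index $i$ and the step $t$, so that

$$w_i = \sum_t |\phi(i,t+1) - \phi(i,t)| \qquad (2)$$

$$= \sum_j 2^{-k_{ij}} \qquad (3)$$

$$i, j \mapsto k_{ij} \ \text{computable} \qquad (4)$$

The K-C theorem [Lev71, Sch73, Cha75, DH10] says that for any computable sequence
of pairs $\{k_{ij} \in \mathbb{N}, \tau_{ij} \in \mathbb{B}^*\}_{i,j \in \mathbb{N}}$ with $\sum_{i,j} 2^{-k_{ij}} \le 1$, there exists a prefix
Turing machine $P$ and strings $\{\sigma_{ij} \in \mathbb{B}^*\}$ such that

$$\ell(\sigma_{ij}) = k_{ij}, \qquad P(\sigma_{ij}) = \tau_{ij} \qquad (5)$$

Choosing distinct $\tau_{ij}$ and the existence of prefix machine $P$ ensures that $\{\sigma_{ij}\}$ is
prefix free. We now define a monotone Turing machine $U$. For strings of the form
$\sigma_{ij}p$ for some $i, j$:

$$U(\sigma_{ij}p) := T_i(p) \qquad (6)$$

For strings not of this form, $U$ produces no output. $U$ inherits monotonicity from
the $T_i$, and since $\{T_i\}_{i \in \mathbb{N}}$ enumerates all monotone Turing machines, $U$ is universal.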
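The step from (2) to (4), rewriting each weight as a sum of powers of two with computable exponents, can be sketched as follows. The sketch assumes, for simplicity, that every approximation value is a dyadic rational (a fraction whose denominator is a power of two); this restriction and the function names `dyadic_exponents` and `weight_exponents` are assumptions made here, not part of the proof.

```python
from fractions import Fraction


def dyadic_exponents(increment):
    """Return exponents k with sum(2^-k) == increment, for a dyadic rational >= 0."""
    n, d = increment.numerator, increment.denominator
    if n < 0 or d & (d - 1) != 0:
        raise ValueError("increment must be a non-negative dyadic rational")
    m = d.bit_length() - 1  # d == 2^m
    # Each set bit b of the numerator contributes 2^b / 2^m = 2^-(m - b).
    return [m - b for b in range(n.bit_length() - 1, -1, -1) if (n >> b) & 1]


def weight_exponents(approximations):
    """Exponents for all increments of a non-decreasing lower approximation."""
    exponents = []
    for lower, higher in zip(approximations, approximations[1:]):
        exponents.extend(dyadic_exponents(higher - lower))
    return exponents


steps = [Fraction(0), Fraction(1, 4), Fraction(1, 4), Fraction(5, 8), Fraction(11, 16)]
ks = weight_exponents(steps)
print(ks)
print(sum(Fraction(1, 2 ** k) for k in ks))
```

Each exponent is emitted after finitely many approximation steps, which is what makes the map from indices to exponents computable; the sum of $2^{-k}$ over the exponents emitted so far always equals the latest approximation value.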
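The Solomonoff prior associated with $U$ is then:

$$\lambda_U(x) = |U^{-1}(x*)| \qquad (7)$$

$$= \sum_{i,j} 2^{-\ell(\sigma_{ij})}\,|T_i^{-1}(x*)| \qquad (8)$$

$$= \sum_i \Big( \sum_j 2^{-k_{ij}} \Big)\,\lambda_{T_i}(x) \qquad (9)$$

$$= \sum_i w_i\,\nu_i(x) \qquad (10)$$

$$= \xi(x) \qquad (11)$$

□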



The main theorem for this section is now trivial:

Theorem 14. The classes $\mathcal{U}_M$ of Solomonoff priors and $\mathcal{U}_\xi$ of universal mixtures
are exactly equivalent. In other words, the two constructions define exactly the same
set of priors: $\mathcal{U}_M = \mathcal{U}_\xi$.

Proof. Follows directly from Lemma 10 and Lemma 13. □

4 Not all universally dominant enumerable
semimeasures are universal mixtures

In this section, we see that a universal mixture must have a "gap" in the semimeasure
inequality greater than $c\,2^{-K(\ell(x))}$ for some constant $c > 0$ independent of $x$, and that
there are universally dominant enumerable semimeasures that fail this requirement.
This shows that not all universally dominant enumerable semimeasures are universal
mixtures.

Lemma 15. For every Solomonoff prior $M$ and associated universal monotone Turing
machine $U$, there exists a real constant $c > 0$ such that

$$M(x) - M(x0) - M(x1) \ge c\,2^{-K(\ell(x))}\,M(x) \quad \forall x \in \mathbb{B}^*$$

where the Kolmogorov complexity $K(n)$ of an integer $n$ is the length of the shortest
prefix code for $n$.

Proof. First, note that $M(x) - M(x0) - M(x1)$ measures the set of programs $U^{-1}(x)$
for which $U$ outputs $x$ and no more. Consider the set

$$\mathcal{P} := \{q\,l'p \mid p \in \mathbb{B}^*,\ U(p) \sqsupseteq x\}$$
where $l'$ is a shortest prefix code for $\ell(x)$ and $q$ is a program such that $U(q\,l'p)$
executes $U(p)$ until $\ell(x)$ bits are output, then stops.

Now, for each $r = q\,l'p \in \mathcal{P}$ we have $U(r) = x$ since $U(p) \sqsupseteq x$ and $q$ executes
$U(p)$ until $\ell(x)$ bits are output. Thus $\mathcal{P} \subseteq U^{-1}(x)$ and

$$|\mathcal{P}| \le |U^{-1}(x)| \qquad (12)$$

Also $\mathcal{P} = q\,l'\,U^{-1}(x*) := \{s = q\,l'p \mid p \in U^{-1}(x*)\}$, and so

$$|\mathcal{P}| = 2^{-\ell(q\,l')}\,|U^{-1}(x*)| \qquad (13)$$

Combining (12) and (13) and noting that $M(x) - M(x0) - M(x1) = |U^{-1}(x)|$ and
$M(x) = |U^{-1}(x*)|$ we obtain

$$M(x) - M(x0) - M(x1) = |U^{-1}(x)|$$
$$\ge |\mathcal{P}|$$
$$= 2^{-\ell(q\,l')}\,|U^{-1}(x*)|$$
$$= 2^{-\ell(q)}\,2^{-K(\ell(x))}\,M(x)$$

Setting $c := 2^{-\ell(q)}$ this proves the result. □

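Theorem 16. Not all universally dominant enumerable semimeasures are universal
mixtures: $\mathcal{U}_\xi \subsetneq \mathcal{U}_\delta$.

Proof. Take some universally dominant semimeasure $\delta$, then define $\delta'(\epsilon) := 1$,
$\delta'(0) = \delta'(1) := \frac{1}{2}$, $\delta'(bx) := \frac{1}{2}\delta(bx)$ for $b \in \mathbb{B}$, $x \in \mathbb{B}^* \setminus \epsilon$. $\delta'$ is clearly a universally
dominant enumerable semimeasure with $\delta'(0) + \delta'(1) = \delta'(\epsilon)$, and by Lemma 15 it
is not a universal mixture. □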
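The bookkeeping behind this construction can be checked mechanically on a stand-in semimeasure. In the sketch below, `toy_delta` (with $\delta(x) = 3^{-\ell(x)}$) is a hypothetical placeholder that is certainly not universally dominant; it only serves to confirm that the modified function remains a semimeasure, stays within a factor of one half of the original, and has no gap at the empty string. The names `delta_prime`, `gap` and `all_strings` are likewise assumptions made here.

```python
from fractions import Fraction
from itertools import product


def toy_delta(x):
    """Hypothetical stand-in semimeasure delta(x) = 3^-len(x); not universally dominant."""
    return Fraction(1, 3 ** len(x))


def delta_prime(delta, x):
    """The modification from the proof: fix the values at lengths 0 and 1, halve the rest."""
    if x == "":
        return Fraction(1)
    if len(x) == 1:
        return Fraction(1, 2)
    return delta(x) / 2


def gap(mu, x):
    """Probability gap mu(x) - mu(x0) - mu(x1) in the semimeasure inequality."""
    return mu(x) - mu(x + "0") - mu(x + "1")


def all_strings(max_len):
    """All binary strings of length 0..max_len."""
    return ["".join(bits) for n in range(max_len + 1) for bits in product("01", repeat=n)]


def modified(x):
    return delta_prime(toy_delta, x)


print(gap(modified, ""))   # gap at the empty string
print(gap(modified, "0"))  # gap one level down
```

A zero gap at the empty string is exactly the property that the proof plays off against the positive lower bound of Lemma 15.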



5 Conclusions 

One of Solomonoff's more famous contributions is the invention of a theoretically 
ideal universal induction mechanism. The universal prior used in this mechanism 
can be defined/constructed in several ways. We clarify the relationships between 
three different definitions of universal priors, namely universal mixtures, Solomonoff 
priors and universally dominant semimeasures. We show that the class of universal 
mixtures and the class of Solomonoff priors are exactly the same while the class of 
universally dominant lower semicomputable semimeasures is a strictly larger set. 

We have identified some aspects of the discrepancy between Solomonoff priors/universal
mixtures and universally dominant lower semicomputable semimeasures;
however, a clearer understanding and characterisation would be of interest.

Since universal dominance is all that is needed to prove convergence for universal 
induction [Hut05, Sol78] it is interesting to ask whether the extra properties of 
the smaller class of Solomonoff priors have any positive consequences for universal 
induction. 